48 research outputs found

    Special purpose parallel computer architecture for real-time control and simulation in robotic applications

    This is a real-time robotic controller and simulator (RRCS) with a MIMD-SIMD parallel architecture for interfacing with an external host computer and providing a high degree of parallelism in computations for robotic control and simulation. It includes a host processor for receiving instructions from the external host computer and for transmitting answers to the external host computer. There are a plurality of SIMD microprocessors, each being a SIMD parallel processor capable of exploiting fine-grain parallelism and further being able to operate asynchronously to form a MIMD architecture. Each SIMD processor comprises a SIMD architecture capable of performing two matrix-vector operations in parallel while fully exploiting parallelism in each operation. There is a system bus connecting the host processor to the plurality of SIMD microprocessors and a common clock providing a continuous sequence of clock pulses. There is also a ring structure interconnecting the plurality of SIMD microprocessors and connected to the clock for providing the clock pulses to the SIMD microprocessors and for providing a path for the flow of data and instructions between the SIMD microprocessors. The host processor includes logic for controlling the RRCS by interpreting instructions sent by the external host computer, decomposing the instructions into a series of computations to be performed by the SIMD microprocessors, using the system bus to distribute associated data among the SIMD microprocessors, and initiating activity of the SIMD microprocessors to perform the computations on the data by procedure call.

    A unifying framework for rigid multibody dynamics and serial and parallel computational issues

    A unifying framework for various formulations of the dynamics of open-chain rigid multibody systems is discussed. Their suitability for serial and parallel processing is assessed. The framework is based on the derivation of intrinsic, i.e., coordinate-free, equations of the algorithms, which provides a suitable abstraction and permits a distinction to be made between the computational redundancy in the intrinsic and extrinsic equations. A set of spatial notation is used which allows the derivation of the various algorithms in a common setting and thus clarifies the relationships among them. The three classes of algorithms, viz., O(n), O(n^2), and O(n^3), for the solution of the dynamics problem are investigated. Researchers begin with the derivation of O(n^3) algorithms based on the explicit computation of the mass matrix, which provides insight into the underlying basis of the O(n) algorithms. From a computational perspective, the optimal choice of a coordinate frame for the projection of the intrinsic equations is discussed and the serial computational complexity of the different algorithms is evaluated. The three classes of algorithms are also analyzed for suitability for parallel processing. It is shown that the problem belongs to the class of NC and that the time and processor bounds are of O(log^2 n) and O(n^4), respectively. However, the algorithm that achieves the above bounds is not stable. Researchers show that the fastest stable parallel algorithm achieves a computational complexity of O(n) with O(n^2) processors, and results from the parallelization of the O(n^3) serial algorithm.

    Parallel O(log n) algorithms for open- and closed-chain rigid multibody systems based on a new mass matrix factorization technique

    In this paper, parallel O(log n) algorithms for computation of rigid multibody dynamics are developed. These parallel algorithms are derived by parallelization of new O(n) algorithms for the problem. The underlying feature of these O(n) algorithms is a drastically different strategy for decomposition of interbody force which leads to a new factorization of the mass matrix (M). Specifically, it is shown that a factorization of the inverse of the mass matrix in the form of the Schur Complement is derived as M^{-1} = C - B^{*} A^{-1} B, wherein matrices C, A, and B are block tridiagonal matrices. The new O(n) algorithm is then derived as a recursive implementation of this factorization of M^{-1}. For the closed-chain systems, similar factorizations and O(n) algorithms for computation of Operational Space Mass Matrix lambda and its inverse lambda^{-1} are also derived. It is shown that these O(n) algorithms are strictly parallel, that is, they are less efficient than other algorithms for serial computation of the problem. But, to our knowledge, they are the only known algorithms that can be parallelized and that lead to both time- and processor-optimal parallel algorithms for the problem, i.e., parallel O(log n) algorithms with O(n) processors. The developed parallel algorithms, in addition to their theoretical significance, are also practical from an implementation point of view due to their simple architectural requirements.

    Serial and parallel computation of Kane's equations for multibody dynamics

    The efficiency of algorithms resulting from Kane's equation for serial and parallel computation of the mass matrix is examined. The algorithms resulting from Kane's equation and the modified Kane's equations are detailed. Two classes of algorithms for computation of the mass matrix are analyzed: the Newton-Euler based algorithms and the composite rigid body algorithms. The efficiency of the different algorithms for serial and parallel computation is also analyzed. Conclusions are drawn and presented.

    Highly parallel computer architecture for robotic computation

    In a computer having a large number of single instruction multiple data (SIMD) processors, each of the SIMD processors has two sets of three individual processor elements controlled by a master control unit and interconnected among a plurality of register file units where data is stored. The register files input and output data in synchronism with a minor cycle clock under control of two slave control units controlling the register file units connected to respective ones of the two sets of processor elements. Depending upon which ones of the register file units are enabled to store or transmit data during a particular minor clock cycle, the processor elements within an SIMD processor are connected in rings or in pipeline arrays, and may exchange data with the internal bus or with neighboring SIMD processors through interface units controlled by respective ones of the two slave control units

    Parallel algorithms and architecture for computation of manipulator forward dynamics

    Parallel computation of manipulator forward dynamics is investigated. Considering three classes of algorithms for the solution of the problem, that is, the O(n), the O(n^2), and the O(n^3) algorithms, parallelism in the problem is analyzed. It is shown that the problem belongs to the class of NC and that the time and processor bounds are of O(log^2 n) and O(n^4), respectively. However, the fastest stable parallel algorithms achieve the computation time of O(n) and can be derived by parallelization of the O(n^3) serial algorithms. Parallel computation of the O(n^3) algorithms requires the development of parallel algorithms for a set of fundamentally different problems, that is, the Newton-Euler formulation, the computation of the inertia matrix, decomposition of the symmetric, positive definite matrix, and the solution of triangular systems. Parallel algorithms for this set of problems are developed which can be efficiently implemented on a unique architecture, a triangular array of n(n+2)/2 processors with a simple nearest-neighbor interconnection. This architecture is particularly suitable for VLSI and WSI implementations. The developed parallel algorithm, compared to the best serial O(n) algorithm, achieves an asymptotic speedup of more than two orders of magnitude in the computation of the forward dynamics.
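
    As a rough serial illustration of the last two steps named in the abstract above (decomposition of the symmetric, positive definite inertia matrix and solution of triangular systems), the following sketch solves M * qdd = tau - bias for the joint accelerations. It is not the paper's parallel algorithm. The function names, the `bias` vector (standing in for the output of the Newton-Euler step), and the sample numbers are assumptions made for illustration only.

```python
import math


def cholesky(M):
    """Return lower-triangular L with L * L^T = M (M symmetric positive definite)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(M[i][i] - s)
            else:
                L[i][j] = (M[i][j] - s) / L[j][j]
    return L


def forward_substitution(L, b):
    """Solve L * y = b for lower-triangular L."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y


def back_substitution(L, y):
    """Solve L^T * x = y, reading L^T directly from the lower-triangular L."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def joint_accelerations(M, tau, bias):
    """Hypothetical helper: solve M * qdd = tau - bias via Cholesky factorization."""
    rhs = [t - b for t, b in zip(tau, bias)]
    L = cholesky(M)
    return back_substitution(L, forward_substitution(L, rhs))


# Made-up 2-joint example: inertia matrix, applied torques, zero bias forces.
qdd = joint_accelerations([[4.0, 2.0], [2.0, 3.0]], [10.0, 8.0], [0.0, 0.0])
```

    The factorization is the cubic-cost step in this serial form, which is why it is one of the pieces the abstract singles out for parallelization.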

    System for solving diagnosis and hitting set problems

    The diagnosis problem arises when a system's actual behavior contradicts the expected behavior, thereby exhibiting symptoms (a collection of conflict sets). System diagnosis is then the task of identifying faulty components that are responsible for anomalous behavior. To solve the diagnosis problem, the present invention describes a method for finding the minimal set of faulty components (minimal diagnosis set) that explain the conflict sets. The method includes acts of creating a matrix of the collection of conflict sets, and then creating nodes from the matrix such that each node is a node in a search tree. A determination is made as to whether each node is a leaf node or has any children nodes. If any given node has children nodes, then the node is split until all nodes are leaf nodes. Information gathered from the leaf nodes is used to determine the minimal diagnosis set
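
    To make the relationship between conflict sets and a minimal diagnosis set concrete, here is a minimal brute-force sketch. It is not the search-tree method the abstract describes: it simply tries component subsets in order of increasing size and returns the smallest ones that intersect every conflict set. The function name and the component labels are assumptions for illustration.

```python
from itertools import combinations


def minimal_hitting_sets(conflict_sets):
    """Return all smallest sets of components that intersect every conflict set."""
    components = sorted(set().union(*conflict_sets))
    for size in range(1, len(components) + 1):
        hits = [
            set(candidate)
            for candidate in combinations(components, size)
            if all(set(candidate) & conflict for conflict in conflict_sets)
        ]
        if hits:
            # The first size that yields any hitting set is the minimum cardinality.
            return hits
    return []


# Two made-up conflict sets that share component "B".
diagnoses = minimal_hitting_sets([{"A", "B"}, {"B", "C"}])
```

    Exhaustive enumeration grows exponentially with the number of components, which is the cost that tree-based and other pruned searches aim to reduce.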

    High-Performance Algorithm for Solving the Diagnosis Problem

    An improved method of model-based diagnosis of a complex engineering system is embodied in an algorithm that involves considerably less computation than do prior such algorithms. This method and algorithm are based largely on developments reported in several NASA Tech Briefs articles: The Complexity of the Diagnosis Problem (NPO-30315), Vol. 26, No. 4 (April 2002), page 20; Fast Algorithms for Model-Based Diagnosis (NPO-30582), Vol. 29, No. 3 (March 2005), page 69; Two Methods of Efficient Solution of the Hitting-Set Problem (NPO-30584), Vol. 29, No. 3 (March 2005), page 73; and Efficient Model-Based Diagnosis Engine (NPO-40544), on the following page. Some background information from the cited articles is prerequisite to a meaningful summary of the innovative aspects of the present method and algorithm. In model-based diagnosis, the function of each component and the relationships among all the components of the engineering system to be diagnosed are represented as a logical system denoted the system description (SD). Hence, the expected normal behavior of the engineering system is the set of logical consequences of the SD. Faulty components lead to inconsistencies between the observed behaviors of the system and the SD. Diagnosis, the task of finding faulty components, is reduced to finding those components, the abnormalities of which could explain all the inconsistencies. The solution of the diagnosis problem should be a minimal diagnosis, which is a minimal set of faulty components. The calculation of a minimal diagnosis is inherently a hard problem, the solution of which requires amounts of computation time and memory that increase exponentially with the number of components of the engineering system. Among the developments to reduce the computational burden, as reported in the cited articles, is the mapping of the diagnosis problem onto the integer-programming (IP) problem. This mapping makes it possible to utilize a variety of algorithms developed previously for IP to solve the diagnosis problem. In the IP approach, the diagnosis problem can be formulated as a linear integer optimization problem, which can be solved by use of well-developed integer-programming algorithms. This concludes the background information.
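
    One common way to write such a mapping, shown here as a sketch under assumptions rather than as the formulation used in the cited articles, is a 0-1 program: minimize the sum of x_j subject to A x >= 1, where the hypothetical matrix entry A[i][j] is 1 when component j appears in inconsistency (conflict) i, and x_j = 1 marks component j as faulty. The toy solver below enumerates all binary vectors and merely stands in for a real integer-programming algorithm.

```python
from itertools import product


def solve_diagnosis_ip(A):
    """Brute-force 0-1 program: minimize sum(x) subject to A x >= 1 row by row."""
    n = len(A[0])
    best = None
    for x in product((0, 1), repeat=n):
        # Each row is one conflict; it must contain at least one selected component.
        feasible = all(sum(a * xj for a, xj in zip(row, x)) >= 1 for row in A)
        if feasible and (best is None or sum(x) < sum(best)):
            best = x
    return best


# Made-up incidence matrix: conflict 0 involves components 0 and 1,
# conflict 1 involves components 1 and 2.
x_opt = solve_diagnosis_ip([[1, 1, 0], [0, 1, 1]])
```

    Here the objective counts the components declared faulty, so an optimal x corresponds to a diagnosis of minimum cardinality.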

    Efficient Multiplexer FPGA Block Structures Based on G4FETs

    Generic structures have been conceived for multiplexer blocks to be implemented in field-programmable gate arrays (FPGAs) based on four-gate field-effect transistors (G4FETs). This concept is a contribution to the continuing development of digital logic circuits based on G4FETs and serves as a further demonstration that logic circuits based on G4FETs could be more efficient (in the sense that they could contain fewer transistors), relative to functionally equivalent logic circuits based on conventional transistors. Results in this line of development at earlier stages were summarized in two previous NASA Tech Briefs articles: "G4FETs as Universal and Programmable Logic Gates" (NPO-41698), Vol. 31, No. 7 (July 2007), page 44, and "Efficient G4FET-Based Logic Circuits" (NPO-44407), Vol. 32, No. 1 (January 2008), page 38. As described in the first-mentioned previous article, a G4FET can be made to function as a three-input NOT-majority gate, which has been shown to be a universal and programmable logic gate. The universality and programmability could be exploited to design logic circuits containing fewer components than are required for conventional transistor-based circuits performing the same logic functions. The second-mentioned previous article reported results of a comparative study of NOT-majority-gate (G4FET)-based logic-circuit designs and equivalent NOR- and NAND-gate-based designs utilizing conventional transistors. [NOT gates (inverters) were also included, as needed, in both the G4FET- and the NOR- and NAND-based designs.] In most of the cases studied, fewer logic gates (and, hence, fewer transistors) were required in the G4FET-based designs. There are two popular categories of FPGA block structures or architectures: one based on multiplexers, the other based on lookup tables. In standard multiplexer-based architectures, the basic building block is a tree-like configuration of multiplexers, with possibly a few additional logic gates such as ANDs or ORs. Interconnections are realized by means of programmable switches that may connect the input terminals of a block to output terminals of other blocks, may bridge together some of the inputs, or may connect some of the input terminals to signal sources representing constant logical levels 0 or 1. The left part of the figure depicts a four-to-one G4FET-based multiplexer tree; the right part of the figure depicts a functionally equivalent four-to-one multiplexer based on conventional transistors. The G4FET version would contain 54 transistors; the conventional version contains 70 transistors.
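
    The sense in which a three-input NOT-majority gate is programmable can be checked at the truth-table level: tying the third input to 0 gives NAND, and tying it to 1 gives NOR. The sketch below models only that Boolean behavior and then builds a four-to-one multiplexer tree from the resulting NANDs. The gate-level construction and all function names are assumptions for illustration; this is not the circuit in the article's figure and it says nothing about transistor counts.

```python
from itertools import product


def not_majority(a, b, c):
    """Three-input NOT-majority: 0 when at least two inputs are 1, else 1."""
    return 0 if (a + b + c) >= 2 else 1


def nand(a, b):
    return not_majority(a, b, 0)  # third input tied to logic 0


def nor(a, b):
    return not_majority(a, b, 1)  # third input tied to logic 1


def inv(a):
    return not_majority(a, a, a)  # all inputs tied together


def mux2(sel, d0, d1):
    """Two-to-one multiplexer from NANDs: returns d0 when sel is 0, d1 when sel is 1."""
    return nand(nand(d0, inv(sel)), nand(d1, sel))


def mux4(s1, s0, d0, d1, d2, d3):
    """Four-to-one multiplexer as a tree of two-to-one multiplexers."""
    return mux2(s1, mux2(s0, d0, d1), mux2(s0, d2, d3))


# Exhaustively confirm that the tree selects input number 2*s1 + s0.
mux4_ok = all(
    mux4(s1, s0, *data) == data[2 * s1 + s0]
    for s1, s0 in product((0, 1), repeat=2)
    for data in product((0, 1), repeat=4)
)
```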

    An Efficient Reachability Analysis Algorithm

    A document discusses a new algorithm for generating higher-order dependencies for diagnostic and sensor placement analysis when a system is described with a causal modeling framework. This innovation will be used in diagnostic and sensor optimization and analysis tools. Fault detection, diagnosis, and prognosis are essential tasks in the operation of autonomous spacecraft, instruments, and in-situ platforms. This algorithm will serve as a powerful tool for technologies that satisfy a key requirement of autonomous spacecraft, including science instruments and in-situ missions.